Goto

Collaborating Authors

 convolutional autoencoder



Discovering Operational Patterns Using Image-Based Convolutional Clustering and Composite Evaluation: A Case Study in Foundry Melting Processes

arXiv.org Artificial Intelligence

Industrial process monitoring increasingly relies on sensor-generated time-series data, yet the lack of labels, high variability, and operational noise make it difficult to extract meaningful patterns using conventional methods. Existing clustering techniques either rely on fixed distance metrics or deep models designed for static data, limiting their ability to handle dynamic, unstructured industrial sequences. Addressing this gap, this paper proposes a novel framework for unsupervised discovery of operational modes in univariate time-series data using image-based convolutional clustering with composite internal evaluation. The proposed framework improves upon existing approaches in three ways: (1) raw time-series sequences are transformed into grayscale matrix representations via overlapping sliding windows, allowing effective feature extraction using a deep convolutional autoencoder; (2) the framework integrates both soft and hard clustering outputs and refines the selection through a two-stage strategy; and (3) clustering performance is objectively evaluated by a newly developed composite score, S_eva, which combines normalized Silhouette, Calinski-Harabasz, and Davies-Bouldin indices. Applied to over 3900 furnace melting operations from a Nordic foundry, the method identifies seven explainable operational patterns, revealing significant differences in energy consumption, thermal dynamics, and production duration. Compared to classical and deep clustering baselines, the proposed approach achieves superior overall performance, greater robustness, and domain-aligned explainability. The framework addresses key challenges in unsupervised time-series analysis, such as sequence irregularity, overlapping modes, and metric inconsistency, and provides a generalizable solution for data-driven diagnostics and energy optimization in industrial systems.


Attention-Enhanced Convolutional Autoencoder and Structured Delay Embeddings for Weather Prediction

arXiv.org Artificial Intelligence

Weather prediction is a quintessential problem involving the forecasting of a complex, nonlinear, and chaotic high-dimensional dynamical system. This work introduces an efficient reduced-order modeling (ROM) framework for short-range weather prediction and investigates fundamental questions in dimensionality reduction and reduced order modeling of such systems. Unlike recent AI-driven models, which require extensive computational resources, our framework prioritizes efficiency while achieving reasonable accuracy. Specifically, a ResNet-based convolutional autoencoder augmented by block attention modules is developed to reduce the dimensionality of high-dimensional weather data. Subsequently, a linear operator is learned in the time-delayed embedding of the latent space to efficiently capture the dynamics. Using the ERA5 reanalysis dataset, we demonstrate that this framework performs well in-distribution as evidenced by effectively predicting weather patterns within training data periods. We also identify important limitations in generalizing to future states, particularly in maintaining prediction accuracy beyond the training window. Our analysis reveals that weather systems exhibit strong temporal correlations that can be effectively captured through linear operations in an appropriately constructed embedding space, and that projection error rather than inference error is the main bottleneck. These findings shed light on some key challenges in reduced-order modeling of chaotic systems and point toward opportunities for hybrid approaches that combine efficient reduced-order models as baselines with more sophisticated AI architectures, particularly for applications in long-term climate modeling where computational efficiency is paramount.


A Real-Time BCI for Stroke Hand Rehabilitation Using Latent EEG Features from Healthy Subjects

arXiv.org Artificial Intelligence

This study presents a real-time, portable brain-computer interface (BCI) system designed to support hand rehabilitation for stroke patients. The system combines a low cost 3D-printed robotic exoskeleton with an embedded controller that converts brain signals into physical hand movements. EEG signals are recorded using a 14-channel Emotiv EPOC+ headset and processed through a supervised convolutional autoencoder (CAE) to extract meaningful latent features from single-trial data. The model is trained on publicly available EEG data from healthy individuals (WAY-EEG-GAL dataset), with electrode mapping adapted to match the Emotiv headset layout. Among several tested classifiers, Ada Boost achieved the highest accuracy (89.3%) and F1-score (0.89) in offline evaluations. The system was also tested in real time on five healthy subjects, achieving classification accuracies between 60% and 86%. The complete pipeline - EEG acquisition, signal processing, classification, and robotic control - is deployed on an NVIDIA Jetson Nano platform with a real-time graphical interface. These results demonstrate the system's potential as a low-cost, standalone solution for home-based neurorehabilitation.


DeepBoost-AF: A Novel Unsupervised Feature Learning and Gradient Boosting Fusion for Robust Atrial Fibrillation Detection in Raw ECG Signals

arXiv.org Artificial Intelligence

Atrial fibrillation (AF) is a prevalent cardiac arrhythmia associated with elevated health risks, where timely detection is pivotal for mitigating stroke-related morbidity. This study introduces an innovative hybrid methodology integrating unsupervised deep learning and gradient boosting models to improve AF detection. A 19-layer deep convolutional autoencoder (DCAE) is coupled with three boosting classifiers-AdaBoost, XGBoost, and LightGBM (LGBM)-to harness their complementary advantages while addressing individual limitations. The proposed framework uniquely combines DCAE with gradient boosting, enabling end-to-end AF identification devoid of manual feature extraction. The DCAE-LGBM model attains an F1-score of 95.20%, sensitivity of 99.99%, and inference latency of four seconds, outperforming existing methods and aligning with clinical deployment requirements. The DCAE integration significantly enhances boosting models, positioning this hybrid system as a reliable tool for automated AF detection in clinical settings.



Quantum Circuits for Quantum Convolutions: A Quantum Convolutional Autoencoder

arXiv.org Artificial Intelligence

Quantum machine learning deals with leveraging quantum theory with classic machine learning algorithms. Current research efforts study the advantages of using quantum mechanics or quantum information theory to accelerate learning time or convergence. Other efforts study data transformations in the quantum information space to evaluate robustness and performance boosts. This paper focuses on processing input data using randomized quantum circuits that act as quantum convolutions producing new representations that can be used in a convolutional network. Experimental results suggest that the performance is comparable to classic convolutional neural networks, and in some instances, using quantum convolutions can accelerate convergence.


MRExtrap: Longitudinal Aging of Brain MRIs using Linear Modeling in Latent Space

arXiv.org Artificial Intelligence

Simulating aging in 3D brain MRI scans can reveal disease progression patterns in neurological disorders such as Alzheimer's disease. Current deep learning-based generative models typically approach this problem by predicting future scans from a single observed scan. We investigate modeling brain aging via linear models in the latent space of convolutional autoencoders (MRExtrap). Our approach, MRExtrap, is based on our observation that autoencoders trained on brain MRIs create latent spaces where aging trajectories appear approximately linear. We train autoencoders on brain MRIs to create latent spaces, and investigate how these latent spaces allow predicting future MRIs through linear extrapolation based on age, using an estimated latent progression rate $\boldsymbolβ$. For single-scan prediction, we propose using population-averaged and subject-specific priors on linear progression rates. We also demonstrate that predictions in the presence of additional scans can be flexibly updated using Bayesian posterior sampling, providing a mechanism for subject-specific refinement. On the ADNI dataset, MRExtrap predicts aging patterns accurately and beats a GAN-based baseline for single-volume prediction of brain aging. We also demonstrate and analyze multi-scan conditioning to incorporate subject-specific progression rates. Finally, we show that the latent progression rates in MRExtrap's linear framework correlate with disease and age-based aging patterns from previously studied structural atrophy rates. MRExtrap offers a simple and robust method for the age-based generation of 3D brain MRIs, particularly valuable in scenarios with multiple longitudinal observations.


Structural Connectome Harmonization Using Deep Learning: The Strength of Graph Neural Networks

arXiv.org Artificial Intelligence

Small sample sizes in neuroimaging in general, and in structural connectome (SC) studies in particular limit the development of reliable biomarkers for neurological and psychiatric disorders - such as Alzheimer's disease and schizophrenia - by reducing statistical power, reliability, and generalizability. Large-scale multi-site studies have exist, but they have acquisition-related biases due to scanner heterogeneity, compromising imaging consistency and downstream analyses. While existing SC harmonization methods - such as linear regression (LR), ComBat, and deep learning techniques - mitigate these biases, they often rely on detailed metadata, traveling subjects (TS), or overlook the graph-topology of SCs. To address these limitations, we propose a site-conditioned deep harmonization framework that harmonizes SCs across diverse acquisition sites without requiring metadata or TS that we test in a simulated scenario based on the Human Connectome Dataset. Within this framework, we benchmark three deep architectures - a fully connected autoencoder (AE), a convolutional AE, and a graph convolutional AE - against a top-performing LR baseline. While non-graph models excel in edge-weight prediction and edge existence detection, the graph AE demonstrates superior preservation of topological structure and subject-level individuality, as reflected by graph metrics and fingerprinting accuracy, respectively. Although the LR baseline achieves the highest numerical performance by explicitly modeling acquisition parameters, it lacks applicability to real-world multi-site use cases as detailed acquisition metadata is often unavailable. Our results highlight the critical role of model architecture in SC harmonization performance and demonstrate that graph-based approaches are particularly well-suited for structure-aware, domain-generalizable SC harmonization in large-scale multi-site SC studies.


The model is the message: Lightweight convolutional autoencoders applied to noisy imaging data for planetary science and astrobiology

arXiv.org Artificial Intelligence

The application of convolutional autoencoder deep learning to imaging data for planetary science and astrobiological use is briefly reviewed and explored with a focus on the need to understand algorithmic rationale, process, and results when machine learning is utilized. Successful autoencoders train to build a model that captures the features of data in a dimensionally reduced form (the latent representation) that can then be used to recreate the original input. One application is the reconstruction of incomplete or noisy data. Here a baseline, lightweight convolutional autoencoder is used to examine the utility for planetary image reconstruction or inpainting in situations where there is destructive random noise (i.e., either luminance noise with zero returned data in some image pixels, or color noise with random additive levels across pixel channels). It is shown that, in certain use cases, multi-color image reconstruction can be usefully applied even with extensive random destructive noise with 90% areal coverage and higher. This capability is discussed in the context of intentional masking to reduce data bandwidth, or situations with low-illumination levels and other factors that obscure image data (e.g., sensor degradation or atmospheric conditions). It is further suggested that for some scientific use cases the model latent space and representations have more utility than large raw imaging datasets.